A severe asthma phenotype of excessive airway Haemophilus influenzae relative abundance associated with sputum neutrophilia

Abstract Background Severe asthma (SA) encompasses several clinical phenotypes with a heterogeneous airway microbiome. We determined the phenotypes associated with a low α‐diversity microbiome. Methods Metagenomic sequencing was performed on sputum samples from SA participants. A threshold of 2 standard deviations below the mean of α‐diversity of mild‐moderate asthma and healthy control subjects was used to define those with an abnormal abundance threshold as relative dominant species (RDS). Findings Fifty‐one out of 97 SA samples were classified as RDSs with Haemophilus influenzae RDS being most common (n = 16), followed by Actinobacillus unclassified (n = 10), Veillonella unclassified (n = 9), Haemophilus aegyptius (n = 9), Streptococcus pseudopneumoniae (n = 7), Propionibacterium acnes (n = 5), Moraxella catarrhalis (n = 5) and Tropheryma whipplei (n = 5). Haemophilus influenzae RDS had the highest duration of disease, more exacerbations in previous year and greatest number on daily oral corticosteroids. Hierarchical clustering of RDSs revealed a C2 cluster (n = 9) of highest relative abundance of exclusively Haemophilus influenzae RDSs with longer duration of disease and higher sputum neutrophil counts associated with enrichment pathways of MAPK, NF‐κB, TNF, mTOR and necroptosis, compared to the only other cluster, C1, which consisted of 7 Haemophilus influenzae RDSs out of 42. Sputum transcriptomics of C2 cluster compared to C1 RDSs revealed higher expression of neutrophil extracellular trap pathway (NETosis), IL6‐transignalling signature and neutrophil activation. Conclusion We describe a Haemophilus influenzae cluster of the highest relative abundance associated with neutrophilic inflammation and NETosis indicating a host response to the bacteria. This phenotype of severe asthma may respond to specific antibiotics.

• The relative dominant species (RDS) in 51 sputum samples of severe asthma patients are described including Haemophilus influenzae, Moraxella catarrhalis and Tropheryma whipplei RDSs.
• Hierarchical clustering revealed 2 clusters, one entirely composed of Haemophilus influenzae with relative abundance of 76.3%.
• This cluster was characterised by long duration of asthma and exacerbations, and enriched NETosis and IL-6 trans-signalling pathways.

INTRODUCTION
Molecular phenotypes characterised by either eosinophilic or neutrophilic inflammation have been described from transcriptomic or proteomic analyses of bronchial biopsies or sputum cells from patients with severe asthma. 1,2 The severe eosinophilic asthma associated with Type 2 inflammatory pathways responds well to biologic therapies targeting the cytokines interleukin (IL)-4, IL-5 and IL-13. However, the phenotypes characterised by neutrophilic inflammation remain ill-defined, and such phenotypes may be associated with dysbiosis of the airway microbiome as measured by low diversity, which is described in patients with uncontrolled or severe asthma. 3,4 Metagenomic analysis of the airway microbiome in asthma has revealed an abundance of Haemophilus influenzae and Moraxella catarrhalis associated with sputum neutrophilia and linked to inflammasome and neutrophil activation, 5 in line with studies that have reported neutrophilic inflammation with an enrichment of Proteobacteria including Moraxella and Haemophilus, and in particular Haemophilus influenzae. 4,6,7 Although neutrophils form part of the host mechanism critical for fighting respiratory infections, many opportunistic pathogens have evolved immune-evasion mechanisms to escape entrapment and killing. 8 To define the different phenotypes of severe asthma more closely, we determined whether these could be dominated by any particular bacterial species. To do this, we defined the relative dominant bacterial species in each sputum sample based on the impairment of bacterial species diversity measured by metagenomic analysis of the sputum samples in the severe asthma U-BIOPRED cohort. 9 Microbial α-diversity denotes the relative abundance of microbial species in space and time in a biological sample, and a decrease has been associated with declining health status. 10 We examined the host molecular pathways associated with each of these relative dominant species. In so doing, we have defined a restricted phenotype of severe asthma dominated by the highest abundance of airway Haemophilus influenzae with evidence of sputum neutrophilia and host-bacterial interactions.

Participants
The U-BIOPRED cohort consists of adult severe asthmatics (SA) that included nonsmokers and current and/or ex-smokers, and two control groups, mild-moderate asthmatics (MMA) and healthy volunteers (HC), as previously described 9 (Table S1) at baseline. Twelve months later, a subset of severe asthmatics returned for follow-up. This study was approved by the Ethics Committee of each participating clinical institution and adhered to the standards set by the International Conference on Harmonisation and Good Clinical Practice. All patients gave written informed consent to participate in the study.

Sample collection, DNA/RNA extraction and data preprocessing
Induced sputum samples were obtained following inhalation of hypertonic saline, and transcriptomic profiling was performed using Affymetrix U133 Plus 2.0 (Affymetrix, Santa Clara, CA, USA) microarrays on RNA extracted from sputum cells. Quality checks were performed according to Affymetrix R recommendations and the expression matrix was derived using the robust multi-array average (RMA) method from the affy R package. 11 Metagenomic sequencing analysis was performed on frozen sputum samples from 147 participants (99 SA, 25 MMA, 23 HC) at baseline and 44 SA participants at follow-up by Second Genome (San Francisco, California, USA) 5 (Figure S2). After quality control and removal of host reads from the metagenomic data, samples from 97 SA patients, 25 MMA and 23 HC subjects (145 in total) were used for further analyses. The MetaPhlAn2 pipeline (version 2.7.2) and its marker database 12 were used to estimate microbiome profiles. Metagenomic functional profiling to estimate the abundance of microbial gene families and pathways from metagenomic data was performed using HUMAnN2 13 and the UniProt pathway database (UNIPATHWAY database) 14,15 (Supplementary file, Section 4).

Definition of relative dominant species (RDSs)
The relative abundance of a particular bacterial species was derived from the MetaPhlAn2 pipeline (Supplementary file, Section 3). Using a predetermined Shannon's α-diversity threshold, an abundance cut-off was determined for each bacterial species based on the minimum abundance of that species associated with the reduction in Shannon's α-diversity to, or below, this threshold in the SA cohort (Supplementary file, Section 1). For each species, we iterated through a range of abundance values from 0% to 75% in steps of 0.5%, excluded samples that did not have an abundance at or above that value, and then calculated the Shannon's α-diversity of the remaining samples. When the Shannon's α-diversity of the remaining samples was found to be at or below the predetermined Shannon's α-diversity threshold, that abundance value was designated as the RDS abundance cut-off for that species. Samples that contained no RDSs were designated as non-RDS. The algorithm is shown in Figure S1. The predetermined Shannon's α-diversity threshold was calculated using a Z-score of −2 of Shannon's α-diversity obtained from the combined MMA and HC cohorts (Figures 1 and S3).
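The cut-off search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which is given in Figure S1): the function names are hypothetical, and it assumes that the "Shannon's α-diversity of the remaining samples" is summarised by the mean of the per-sample indices and that the index uses the natural logarithm.

```python
import math
from statistics import mean, stdev

def shannon(rel_abund):
    """Shannon index (natural log) of one sample's species abundances."""
    p = [x for x in rel_abund if x > 0]
    total = sum(p)
    return -sum((x / total) * math.log(x / total) for x in p)

def diversity_threshold(control_profiles):
    """Z-score of -2: mean minus 2 SD of Shannon diversity in the controls."""
    h = [shannon(s) for s in control_profiles]
    return mean(h) - 2.0 * stdev(h)

def rds_cutoff(sa_profiles, species_idx, threshold, step=0.5, max_pct=75.0):
    """Smallest abundance (%) at which the retained samples reach the threshold.

    sa_profiles: list of per-sample abundance lists, in percent.
    Assumption: the retained samples are summarised by their mean Shannon index.
    Returns None when no value in the scanned range qualifies.
    """
    h = [shannon(row) for row in sa_profiles]
    n_steps = int(round(max_pct / step))
    for i in range(n_steps + 1):
        cut = i * step
        # Keep only samples whose abundance of this species is at or above cut.
        kept = [hj for hj, row in zip(h, sa_profiles) if row[species_idx] >= cut]
        if not kept:
            return None
        if mean(kept) <= threshold:
            return cut
    return None
```

With three toy samples `[[90, 5, 5], [40, 30, 30], [10, 45, 45]]` and an illustrative threshold of 0.5, the search for the first species stops at 40.5%, the first value at which only the low-diversity sample is retained.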

Determination of dominant RDS
For samples with multiple RDSs, a dominant RDS was defined. A factor was created for each species in each sample by dividing the abundance of each species in a given sample by the specific threshold unique to that species established in the RDS definition. The species in a given sample with the highest factor was considered to be the dominant RDS in that sample.
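As a small worked illustration of this factor, the sketch below divides each species abundance by its cut-off and returns the species with the highest ratio. The function name and the cut-off values in the usage note are hypothetical and are not taken from the study.

```python
def dominant_rds(sample_abundance, cutoffs):
    """Return (species, factor) for the dominant RDS of one sample, or None.

    sample_abundance: dict of species name -> relative abundance (%).
    cutoffs: dict of species name -> RDS abundance cut-off (%) for that species.
    """
    factors = {
        species: sample_abundance.get(species, 0.0) / cut
        for species, cut in cutoffs.items()
        if cut > 0
    }
    # A species counts as an RDS in this sample once it reaches its cut-off.
    rds = {species: f for species, f in factors.items() if f >= 1.0}
    if not rds:
        return None
    best = max(rds, key=rds.get)
    return best, rds[best]
```

For example, with illustrative cut-offs of 40% for Haemophilus influenzae and 20% for Moraxella catarrhalis, a sample containing 50% and 30% of these species gives factors of 1.25 and 1.5, so Moraxella catarrhalis would be the dominant RDS even though its abundance is lower.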

Clustering on abundance of low-diversity samples
Clustering by species compositional abundance was performed to understand the heterogeneity of low-diversity samples. To determine similarity between samples, the Aitchison distance β-diversity dissimilarity measure 16 was employed because it is suited to compositional data, as the variation between sequence reads for each sample is unknown. 17 This was computed on the species-level relative abundance from the metagenomics data of the RDS samples in SA. Hierarchical Ward2 agglomerative clustering on the Aitchison distance was performed and the optimum number of clusters determined using the average silhouette width 18 and the Calinski-Harabasz score. 19 Consensus clustering 20 was used to assess stability by resampling, randomly removing 10% of the data and repeating the clustering through 1000 iterations. The empirical cumulative distribution function (CDF) displays the consensus distributions for each value of k derived using the ConsensusClusterPlus package. 21
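The Aitchison distance is the Euclidean distance between centred log-ratio (CLR) transformed compositions. The sketch below shows that calculation only. The pseudocount used to handle zero abundances is an assumption, since the zero-handling used in the study is not described here, and the function names are illustrative.

```python
import math

def clr(composition, pseudocount=1e-6):
    """Centred log-ratio transform of one sample.

    The pseudocount (an assumption) avoids log(0) for absent species.
    """
    logs = [math.log(x + pseudocount) for x in composition]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]

def aitchison_distance(a, b, pseudocount=1e-6):
    """Euclidean distance between the CLR coordinates of two samples."""
    ca, cb = clr(a, pseudocount), clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))

def distance_matrix(samples, pseudocount=1e-6):
    """Symmetric matrix of pairwise Aitchison distances."""
    n = len(samples)
    return [
        [aitchison_distance(samples[i], samples[j], pseudocount) for j in range(n)]
        for i in range(n)
    ]
```

Because the distance is Euclidean in CLR space, the CLR coordinates can be passed directly to a Ward-linkage routine. Without a pseudocount the distance is unchanged when a sample is rescaled, which is why it suits relative abundance data.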

Longitudinal analysis
Forty-three out of the 44 follow-up samples had corresponding baseline samples. These were used to define the stability of dominant RDSs between baseline and follow-up.

Gene set variation analysis
Gene set variation analysis (GSVA) was performed in R using the Bioconductor GSVA package to estimate sample-wise enrichment of gene signatures. 22

Statistical analysis
We used pandas 1.4.2 23 and Python 3.9.10 24 data analysis tools. Statistical analyses were performed using R version 4.1.1. 25 Differentially abundant microbiome species between clusters and differentially prevalent microbial pathways were identified using ANCOM-BC, 26 which was selected due to its suitability for compositional data. 17 The Holm-Bonferroni adjustment was used to correct for multiple testing. Differentially abundant species analysis was performed between clusters generated from the hierarchical clustering of low-diversity RDS samples and the MMA/HC group. Differentially expressed gene (DEG) analysis was performed on sputum transcriptomics 27 between different RDS species compared to the MMA and HC groups combined (MMA/HC), and non-RDSs compared to MMA/HC, using limma. 28 A Benjamini-Hochberg false-discovery rate (FDR) adjustment was applied, and genes with FDR < 0.05 and absolute log2 fold-change ≥ 0.5 were considered statistically significant in transcriptomic analyses. DEG results were used to perform pathway enrichment using ClusterProfiler. 29 Comparison of the clinical variables between groups was performed using analysis of variance (ANOVA) for multiple group comparison of normally distributed variables. The Kruskal-Wallis test was used for multiple group comparison of non-normally distributed or ordered categorical variables, and the chi-squared test was used for qualitative variables.
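The significance filter applied to the transcriptomic results can be illustrated as follows. This sketch covers only the Benjamini-Hochberg adjustment and the FDR and fold-change thresholds stated above. It does not reproduce the limma model, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def significant_genes(genes, pvalues, log2fc, fdr=0.05, min_abs_lfc=0.5):
    """Genes with adjusted p < fdr and absolute log2 fold-change >= min_abs_lfc."""
    q = benjamini_hochberg(pvalues)
    return [
        g for g, qi, lfc in zip(genes, q, log2fc)
        if qi < fdr and abs(lfc) >= min_abs_lfc
    ]
```

Both conditions must hold, so a gene with a small adjusted p-value but a fold-change below the threshold is not reported.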

RESULTS
Figure S2 shows the CONSORT diagram with a final partition of the 97 severe asthma subjects into 2 clusters of RDSs.

Low α-diversity RDSs
The distribution and threshold of the relative dominant species in RDSs are shown in Figure S13. Patients with low α-diversity RDSs had higher sputum neutrophils, lower sputum macrophages and a greater prevalence of allergic rhinitis compared to those with normal diversity (Table S2).

Haemophilus influenzae RDS
Haemophilus influenzae was detectable in 44% (43/97) of SA samples with a mean relative abundance of

Note: Presence denotes the percentage of samples where a particular species exists with a greater than zero abundance. Abundance is the median relative abundance of all the samples for a particular species where it exists with a greater than zero abundance. *Data shown as mean ± SD. Abbreviation: RDS, relative dominant species.

Relative to the patients with RDSs, sputum transcriptomics identified few differentially expressed genes (DEGs) between severe asthma patients with normal diversity and MMA/HC groups. There were no DEGs in sputum transcriptomics between patients with RDSs compared to non-RDSs. Patients with Haemophilus influenzae RDS had 5631 more genes upregulated compared to MMA and non-Haemophilus influenzae RDSs (Figure 2A). Pathway analysis of DEGs showed enrichment of positive regulation of natural killer cell-mediated cytotoxicity, positive regulation of type 2a hypersensitivity, antigen processing and toll-like receptor signalling pathway compared to the MMA group (Figure S6A).

Moraxella catarrhalis RDS
Moraxella catarrhalis was detectable in 10% (10/97) of severe asthmatic samples with a mean abundance of 24.98 ± 37.5%, and 50% (5/10) of samples met the loss of diversity criteria of RDS. The mean abundance of Moraxella catarrhalis within RDS samples was 49.09 ± 41.35%. Patients with Moraxella catarrhalis RDS had the shortest duration of disease but had the highest sputum neutrophils (Table 2). Sputum eosinophils were low at 1.4% (p = .001 compared with the non-RDS comparator) and FeNO was also reduced (p < .05). In addition, these subjects had fewer exacerbations per year (p < .05 compared with non-RDSs). Patients with Moraxella catarrhalis RDS had 1884 genes upregulated compared to MMA (Figure 2B and C). Gene ontology analysis of DEGs showed positive regulation of granulocyte colony-stimulating factor and macrophage colony-stimulating factor production and negative regulation of LPS-mediated signalling pathway and IL-12 and IL-13 production compared to MMA (Figure S6B).

Other species RDSs
Characteristics of patients with sole RDSs of Actinobacillus unclassified, Streptococcus pseudopneumoniae and Veillonella unclassified are shown in Table S4. While there was no difference between Veillonella and non-RDSs, there were more obese, mainly nonsmoking, severe asthmatics with higher serum C5a in the Actinobacillus RDSs. However, the numbers in these RDSs were small.

Cluster analysis of low diversity samples
Cluster analysis of samples with low diversity identified two clusters (C1 and C2), as demonstrated by silhouette width (Figure S7A and B) and by Calinski-Harabasz score (Figure S7C). Consensus clustering also identified stability with 2 clusters (Figure S7D and E), with C2 having the lowest Shannon α-diversity (Figure 3A) and the greatest degree of separation on the principal component analysis plot (Figure 3B). The bee-swarm plot in Figure S11 shows no difference in composition between C1 and C2 clusters and non-RDS in terms of nonhuman sequence reads.
C1 had the highest α-diversity with greatest similarity to MMA and HC subjects combined (MMA/HC), corroborated by the differential abundance analysis with minimal changes in species abundance (Figure 3A and B). C2 was characterised by a large increased abundance of Haemophilus influenzae of 76.3 ± 19.9% compared to 2.1 ± 5.1% for C1 (Figure 4A; Table S5). By contrast, Tropheryma whipplei and Moraxella catarrhalis were more abundant in C1 than in C2. The Haemophilus influenzae-exclusive C2 cluster had a longer disease duration compared to C1 (p ≤ .005) (Table 3). Sputum neutrophil counts (%) were higher in C2 (p ≤ .001), sputum eosinophil counts (%) were higher in C1 (p ≤ .05) and sputum macrophages (%) were lower in C2 (p ≤ .001) (Figure 4B). Serum IL-8 was higher in C2 compared to C1 and non-RDS groups (p < .05) (Table 3). Gene ontology pathway analysis of DEGs of sputum between C1 and C2 clusters showed enrichment of pathways in C2 over C1 related to activated MAPK, NF-κB, TNF and mTOR signalling pathways and to necroptosis (Figure S8).
Using gene set variation analysis (GSVA), there was a higher expression score for the gene signatures of NETosis, 30 IL-6 trans-signalling 31 and neutrophil activation, while there was lower expression for the oxidative phosphorylation (OXPHOS) and lung tissue-resident macrophage signatures in the sputum transcriptome of C2 compared to C1 (Figure S9). There were no significant differences for the gene signatures representing Th17 activation, 32 blood eosinophil activation and innate lymphoid cell type-2.
Metagenomic functional analysis demonstrated a distinct microbial pathway profile between C1 and C2 (Figure 5A and B), with PERMANOVA on the Bray-Curtis dissimilarity matrix showing a significant difference (p < .001). Differentially abundant pathway analysis using ANCOM-BC showed that C2 had 119 significantly differentially abundant UniProt subpathways compared to C1 (Figure S10), with differences in amino acid biosynthesis and carbohydrate metabolism and degradation.

Longitudinal analysis of severe asthma samples
From the follow-up cohort, patients with low α-diversity had lower blood eosinophils compared to those with normal diversity (Table S2). Twenty-four samples were categorised as low-diversity RDS samples and 19 as non-RDS samples, with 19 samples containing one RDS and 5 samples containing two RDSs (Figure S2B). Figure S11 depicts the stability of the dominant RDSs between the baseline and follow-up time-points.
At one-year follow-up, 7 of the 22 non-RDS samples became RDS with dominance of Pseudomonas aeruginosa, Haemophilus aegyptius, Haemophilus parainfluenzae, Streptococcus mitis oralis pneumoniae and Veillonella unclassified. Thirteen of 21 RDS samples changed RDS status in terms of bacterial species, with a shift for Moraxella catarrhalis (2 out of 2), Haemophilus influenzae (2 out of 5) and Tropheryma whipplei (1 out of 3).

DISCUSSION
We have quantified the relative abundance of bacterial species in sputum of severe asthma patients by linking it to a state of significantly low diversity defined according to the diversity variance of the mild-moderate asthma and healthy subjects. This approach was taken because lowly abundant taxa are highly vulnerable to major shifts in abundance 33,34 and often disproportionately contribute to the structure and function of their community. 35,36 We have therefore determined the sputum samples of those with severe asthma that had a high relative abundance of a particular species, labelled as an RDS, in order to focus on the species that are prone to these major abundance shifts. Low α-diversity was accompanied by outgrowths of 25 bacterial species, many of which are of unknown pathogenic significance. Of the species known to have pathogenic effects in airways diseases such as asthma and COPD, Haemophilus influenzae was the most prevalent to constitute an RDS (44%), followed by Moraxella catarrhalis and Tropheryma whipplei (14.4% each). Haemophilus influenzae RDS was associated with the longest duration of asthma, highest number of exacerbations and highest use of daily OCS therapy, together with high sputum neutrophilia and lower sputum macrophages, with the presence of sputum eosinophilia only in some subjects. On the other hand, the Moraxella catarrhalis RDS was associated with sputum neutrophilia, low sputum eosinophils and FeNO, and the highest ACQ-5 score, whilst the Tropheryma whipplei RDS was characterised by the presence of type 2 inflammation and severe airflow obstruction. Furthermore, clustering of the bacterial RDSs identified a small but select cluster with the highest relative abundance of Haemophilus influenzae, whose members had a long duration of asthma, with sputum neutrophilia, low eosinophil counts and low FeNO levels but with higher levels of serum IL-8, supporting further the association of increased abundance of Haemophilus influenzae with a neutrophilic inflammatory response and low T2 inflammation.
Notably, the C2 cluster was also associated with a deficiency of sputum macrophages, potentially limiting the efficient clearance of Haemophilus influenzae, and this might explain the increased bacterial colonisation of the lungs observed. Thus, the C2 cluster membership of exclusively Haemophilus influenzae RDSs represented a neutrophilic phenotype with lesser eosinophilic inflammation than the Haemophilus influenzae RDSs of the C1 cluster. More importantly, for the first time, we have linked this C2 Haemophilus influenzae RDS cluster with enrichment of activated pathways associated with MAPK, NF-κB, TNF and mTOR, and of cellular necroptosis, which are consistent with previously reported intracellular activation pathways caused by Haemophilus influenzae on airway epithelial cells. 37,38 We propose that the sputum neutrophilia reflects the host response to detection of the very high relative abundance of Haemophilus influenzae, as supported by the presence of NETosis, which has been recognised as a function of activated neutrophils. 39 It is also possible that the presence of neutrophilic inflammation might create a more favourable environment for the expansion of the Haemophilus influenzae species.
More importantly, the gene signature for neutrophil extracellular traps (NETs) was highly expressed in cluster C2 compared to the C1 cluster using pathway analysis of differentially expressed genes and gene set variation analysis, indicating a host-bacterial response to the presence of high relative abundance (> 45%) of Haemophilus influenzae in the airways. NETs are weblike scaffolds of DNA complexing with histones and neutrophil granular proteins released from neutrophils upon activation by proinflammatory stimuli, including IL-8 and TNF, and downstream kinases such as MAPK and mTOR. Excessive NETs in severe asthma may perpetuate NETopathic inflammation by damaging the airway epithelium. 40 NETs in the context of Haemophilus influenzae infection in COPD can also amplify inflammation through the IL-6 trans-signalling pathway, which was increased in the C2 cluster. 41 Similarly, with IL-8 levels found to be elevated in the serum of C2 cluster individuals, this potent chemokine can also induce NET formation in COPD neutrophils via CXCR2 receptor activation. 42 In the microbial pathways that differentiated the C2 versus the C1 clusters, we found pathways related to carbohydrate biosynthesis and metabolism, some of which were related to glycolysis. This raises the possibility that Haemophilus influenzae might be using the glycolytic pathway as an energy source for growth, as it was the most relatively abundant bacterial species (mean of 76%) and as it is known that Haemophilus influenzae is able to catabolise glucose during both aerobic and anaerobic growth. 43 The supply of glucose in airway surface liquid may be increased in chronic respiratory diseases such as asthma, 44 and in the C2 cluster, 2 of the 9 subjects were labelled as diabetic and these subjects were regularly exposed to bursts of OCS therapy for treatment of frequent episodes of asthma exacerbations, making it likely that there was an increased availability of glucose in the airways.
Moraxella catarrhalis RDSs were associated with an enrichment of similar pathways related to the innate immune system as for Haemophilus influenzae RDSs, but with the additional activation of granulocyte and macrophage colony-stimulating factors, confirming previously reported associations. 45,46 Interestingly, Tropheryma whipplei RDSs had no association with any activated pathways.
Characterising clusters by relative abundance of species means that other species are lost in consequence. Thus, the species lost in the Haemophilus-dominant C2 cluster include Rothia mucilaginosa, which may be anti-inflammatory, primarily through inhibition of the NF-κB pathway, 47 or bioprotective as its levels were diminished in children at risk of asthma, 48 which may help Haemophilus influenzae to promote airway inflammation 49 through the inhibition of other protective or anti-inflammatory species.
Little is known about the other bacterial species that were also reduced in abundance in the C2 cluster, and also about their clinical significance.
We were interested in understanding the samples with relative dominant species and their significance in terms of the host response. We opted to use an extremely low cut-off for Shannon's α-diversity, which has been linked to increased disease severity. Having determined the samples with RDSs, we then clustered on the basis of the relative abundance of the dominant species to find out whether any bacterial species was particularly associated with a host response indicating bacterial-host interactions. This approach, which differs from using the relative abundance directly, has ensured that the microbiome of those with high disease severity would be under focus. Indeed, we found that Haemophilus influenzae was the most prominent in relative abundance, with samples of very high relative abundance associated with activated NETosis, the IL-6 trans-signalling pathway and neutrophil activation, indicating a potential host response to the relatively high abundance of Haemophilus influenzae.
One of the limitations of the study is the relatively low number of samples that constituted an RDS, which is partly due to the stringency of the RDS definition we used, but we managed to define subgroups of Haemophilus influenzae RDSs. The Haemophilus influenzae RDSs, in particular, will need to be confirmed in larger cohorts of severe asthma by using either quantitative PCR methods or culture methods. Currently, we are not aware of any metagenomic data available in asthma cohorts that we can use for validation of these findings. There were even lower numbers of severe asthma participants at the one-year follow-up, which showed that the species RDS profile changes, with even non-RDSs later acquiring RDS status in up to 46% of the severe asthmatics. The factors that determine the stability or change in bacterial species abundance will need to be determined. We also recognise that the metagenomic sequencing used here generates compositional datasets reflecting the proportion of counts per feature per sample, such that only the relative abundance of species is available. Therefore, the concept of an RDS must be seen as the relative outgrowth of a species within a sample. Finally, another limitation is the lack of information regarding recent antibiotic use. However, all asthma participants were studied at least 6 weeks after any exacerbation, when an antibiotic that would have perturbed the sputum microbiome was likely to have been administered.
Although Haemophilus influenzae has been implicated as an important potential pathogenic bacterial species in asthma and other airways diseases, 4,6,7 we have delineated precisely, for the first time, a restricted phenotype of high relative abundance of Haemophilus influenzae in severe asthma associated with high neutrophilic inflammation and an adaptive response of NETosis and IL-6 activation. The clinical implication of the definition of this cluster is that such patients, with a relatively high abundance of Haemophilus influenzae associated with neutrophilic inflammation, may potentially respond to Haemophilus influenzae vaccine or targeted antibiotic therapy.

AUTHOR CONTRIBUTIONS
IMA, JR and KFC conceived the idea; IMA, KFC and RD obtained the funding for U-BIOPRED project; SB, JR and PH obtained the funding for the metagenomic analysis; SB, JR, PH, IMA, MIA, SHC, AV and KFC discussed the approach to data analysis; AV, AA, FXI, MIA and NZK analysed the data; AV, AA and KFC wrote the manuscript; MIA, A-HMZ and SED contributed to its finalisation and all authors agreed with the final version for submission. All authors gave final approval of the manuscript, had full access to all the data in the study, and had final responsibility for the decision to submit for publication.

ACKNOWLEDGEMENTS
U-BIOPRED has received funding from the Innovative Medicines Initiative (IMI) Joint Undertaking under grant agreement no. 115010, resources of which are composed of financial contributions from the European Union's Seventh Framework Programme (FP7/2007-2013), and European Federation of Pharmaceutical Industries and Associations (EFPIA) companies' in-kind contributions (www.imi.europa.eu). We acknowledge the contribution of the whole U-BIOPRED team as listed in the Supplementary Online repository file. KFC and IMA are funded by UK Research and Innovation (UKRI). KFC is Senior Investigator of the UK National Institute for Health Research (NIHR). Ali Versi was supported by BBSRC CASE award PhD studentship.

CONFLICT OF INTEREST STATEMENT
Mr Versi has nothing to declare. Dr Azim reports employment through AstraZeneca. Dr Chotirmall has received lecture fees from Chiesi Farmaceutici and AstraZeneca, serves on advisory boards for Boehringer-Ingelheim, CSL Behring and Pneumagen Ltd. and is on Data and Safety Monitoring Boards (DSMB) for Inovio Pharmaceuticals, all outside of the submitted work. Dr Maitland-van der Zee has received grants from Health Holland and she is the PI of a P4O2 (Precision Medicine for more Oxygen) public private partnership sponsored by Health Holland involving many private partners that contribute in cash and/or in kind (Boehringer Ingelheim, Breathomix, Fluidda, Ortec Logiqcare, Philips, Quantib-U, Smartfish, SODAQ, Thirona, TopMD and Novartis), received unrestricted research grants from GSK, Boehringer Ingelheim and Vertex, received consulting fees paid to her institution from Boehringer Ingelheim and AstraZeneca, and received honoraria for lectures paid to her institution from GlaxoSmithKline, outside the submitted work. Dr Dahlén reports personal fees from AZ, Cayman Chemicals, GSK, Novartis, Regeneron, Sanofi and TEVA, outside the submitted work. Dr Chung has received honoraria for participating in Advisory Board meetings of Roche, Merck, Shionogi and Rickett-Beckinson and has also been remunerated for speaking engagements for Novartis and AZ. Dr Riley worked for and had shares in GSK. Dr Bates reports being currently an employee of Johnson & Johnson and having previously worked for, and holding stock in, GSK. Dr Uddin is an employee of and holds shares in AstraZeneca. Dr Djukanovic declares consulting fees from Synairgen, Sanofi and Galapagos, lecture fees from GSK, AZ and Airways Vista, and he holds shares in Synairgen. Dr Howarth is an employee of GSK. Dr Montuschi, Dr Kermani, Dr Adcock, Dr Ivan and Dr Abdel-Aziz have nothing to declare.

DATA AVAILABILITY STATEMENT
The metagenomic sequence data have been submitted to the NCBI under accession number PRJNA946921 and are accessible at the following link: https://www.ncbi.nlm.nih.gov/sra/PRJNA946921.

ETHICS STATEMENT
This study was approved by the Ethics Committee of each participating clinical institution. All patients gave written informed consent to participate in the study.

FIGURE 1
Flow chart showing the determination of relative dominant species (RDS) by α-diversity threshold in the sputum samples and applied to samples of severe asthma patients and hierarchical clustering of RDS samples. MMA/HC: mild-moderate asthma/healthy controls; SA: severe asthma. The numbers studied in each cohort are shown.

FIGURE 3
Diversity of Aitchison distance clusters of RDS samples. α-diversity using Shannon's index of each of the clusters C1 and C2 with non-RDS and MMA/HC groups (A). β-diversity of the clusters with non-RDS and MMA/HC groups (B). Species-level relative abundance of each of the 2 clusters compared to that of non-RDS and MMA/HC groups (C).

FIGURE 4
Haemophilus influenzae component of C1 and C2 clusters of RDS samples. (A) Haemophilus influenzae abundance of C1 and C2 with non-RDS. (B) Sputum eosinophils and sputum neutrophils (%) of C1 and C2 with non-RDS. (C) Composition log fold-change between each of the C1 (i) and C2 (ii) clusters versus non-RDS group. RDS: relative dominant species of severe asthma groups; MMA/HC: mild-moderate asthma and healthy controls.

FIGURE 5
Microbial pathway prevalence in C1 and C2 cluster samples. (A) Microbial pathway relative abundance of C1 and C2 clusters compared to that of non-RDS. (B) β-diversity of the relative abundant microbial pathways of the C1 and C2 clusters with non-RDS.

Table 1
Characteristics of clusters C1 and C2 and non-RDSs.